Using Repeated Patterns across Comparable Articles for Paraphrase Acquisition
Not recorded
Abstract
We focus on paraphrases for information extraction: expressions which should produce the same extraction output. These expressions are acquired automatically from comparable news articles (articles from the same day, on the same topic). Candidate paraphrases are paths in predicate argument structure starting from matching anchors (typically, names) in the two sentences. By using such syntactically-regularized structures and limiting ourselves to single paths, we increased the likelihood of observing repeated patterns. We measured the frequency of such candidate patterns over a large corpus, and confirmed a correlation between frequency and their accuracy as paraphrases.
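The frequency measurement described in the abstract can be illustrated with a minimal sketch. It assumes the parsing step has already reduced each sentence to an (anchor, pattern) tuple; the function name `count_pattern_pairs`, the pattern strings, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def count_pattern_pairs(article_pairs):
    """Count how often two different patterns share an anchor
    across a pair of comparable articles."""
    counts = Counter()
    for left, right in article_pairs:
        # Each article is a list of (anchor, pattern) tuples.
        for (anchor1, pat1), (anchor2, pat2) in product(left, right):
            if anchor1 == anchor2 and pat1 != pat2:
                # Sort so that (A, B) and (B, A) count as the same pair.
                counts[tuple(sorted((pat1, pat2)))] += 1
    return counts


# Hypothetical toy data: three pairs of comparable articles.
pairs = [
    ([("Acme", "X acquire Y")], [("Acme", "X buy Y")]),
    ([("Globex", "X acquire Y")], [("Globex", "X buy Y")]),
    ([("Initech", "X acquire Y")], [("Initech", "X sue Y")]),
]
counts = count_pattern_pairs(pairs)
for pattern_pair, freq in counts.most_common():
    print(freq, pattern_pair)
```

In this toy data the acquire/buy pair recurs while the acquire/sue pair appears only once, which mirrors the abstract's observation that more frequent candidate pairs tend to be more accurate paraphrases.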
Similar resources
Paraphrase Acquisition for Information Extraction
We are trying to find paraphrases from Japanese news articles which can be used for Information Extraction. We focused on the fact that a single event can be reported in more than one article in different ways. However, certain kinds of noun phrases such as names, dates and numbers behave as “anchors” which are unlikely to change across articles. Our key idea is to identify these anchors among ...
Large Scale Acquisition of Paraphrases for Learning Surface Patterns
Paraphrases have proved to be useful in many applications, including Machine Translation, Question Answering, Summarization, and Information Retrieval. Paraphrase acquisition methods that use a single monolingual corpus often produce only syntactic paraphrases. We present a method for obtaining surface paraphrases, using a 150GB (25 billion words) monolingual corpus. Our method achieves an accu...
Automatic Paraphrase Acquisition from News Articles
Paraphrases play an important role in the variety and complexity of natural language documents. However, they add to the difficulty of natural language processing. Here we describe a procedure for obtaining paraphrases from news articles. A set of paraphrases can be useful for various kinds of applications. Articles derived from different newspapers can contain paraphrases if they report the sam...
Interrogative Reformulation Patterns and Acquisition of Question Paraphrases
We describe a set of paraphrase patterns for questions which we derived from a corpus of questions, and report the result of using them in the automatic recognition of question paraphrases. The aim of our paraphrase patterns is to factor out different syntactic variations of interrogative words, since the interrogative part of a question adds a syntactic superstructure on the sentence part (i.e...
Using Discourse Information for Paraphrase Extraction
Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents’ discourse structure as a useful information source. We propose a novel method for collecting paraphrases relying on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boos...
Publication date: 2005